Phylogenetic profiles reveal evolutionary relationships within the "twilight zone" of sequence similarity.

Authors

  • Gue Su Chang
  • Yoojin Hong
  • Kyung Dae Ko
  • Gaurav Bhardwaj
  • Edward C Holmes
  • Randen L Patterson
  • Damian B van Rossum
Abstract

Inferring evolutionary relationships among highly divergent protein sequences is a daunting task. In particular, when pairwise sequence alignments between protein sequences fall <25% identity, the phylogenetic relationships among sequences cannot be estimated with statistical certainty. Here, we show that phylogenetic profiles generated with the Gestalt Domain Detection Algorithm-Basic Local Alignment Tool (GDDA-BLAST) are capable of deriving, ab initio, phylogenetic relationships for highly divergent proteins in a quantifiable and robust manner. Notably, the results from our computational case study of the highly divergent family of retroelements accord with previous estimates of their evolutionary relationships. Taken together, these data demonstrate that GDDA-BLAST provides an independent and powerful measure of evolutionary relationships that does not rely on potentially subjective sequence alignment. We demonstrate that evolutionary relationships can be measured with phylogenetic profiles, and therefore propose that these measurements can provide key insights into relationships among distantly related and/or rapidly evolving proteins.
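
To make the "<25% identity" threshold mentioned above concrete, the following is a minimal sketch of how percent identity could be computed for an already-aligned pair of sequences. It is not part of GDDA-BLAST or the authors' method. The function names, the gap character "-", and the choice to divide by the number of gap-free aligned columns are all assumptions made for illustration; other definitions of percent identity use a different denominator.

```python
# Illustrative only: names and the identity definition are assumptions,
# not taken from the paper or from GDDA-BLAST.

TWILIGHT_THRESHOLD = 25.0  # percent identity cut-off quoted in the abstract


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    # Return 0.0 when no gap-free column exists, to avoid division by zero.
    return 100.0 * matches / compared if compared else 0.0


def in_twilight_zone(aligned_a: str, aligned_b: str,
                     threshold: float = TWILIGHT_THRESHOLD) -> bool:
    """True when the pair falls below the identity threshold."""
    return percent_identity(aligned_a, aligned_b) < threshold
```

For example, under these assumptions a pair of aligned eight-residue fragments that share a single residue scores 12.5% identity and would be flagged as falling in the twilight zone.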

Similar resources

An Evolutionary and Phylogenetic Study of the BMP15 Gene

DNA sequence data contains a wealth of biologically useful information. Recent innovations in DNA sequencing technology have greatly increased our capacity to determine massive amounts of nucleotide sequences. These sequences can be used to specify the characteristics of different regions, interpret the evolutionary relationships between categorized groups, likelihood of performing multiple com...

Adaptive BLASTing through the Sequence Dataspace: Theories on Protein Sequence Embedding

A major computational challenge in the genomic era is annotating structure/function to the vast quantities of sequence information now available. This problem is illustrated by the fact that most proteins lack comprehensive annotation, even when experimental evidence exists. We theorized that phylogenetic profiles provide a quantitative method that can relate the structural and functional prope...

Comparative Phylogenetic Perspectives on the Evolutionary Relationships in the Brine Shrimp Artemia Leach, 1819 (Crustacea: Anostraca) Based on Secondary Structure of ITS1 Gene

This is the first study on phylogenetic relationships in the genus Artemia Leach, 1819 using the pattern and sequence of secondary structures of internal transcribed spacer 1 (ITS1). Significant intraspecific variation in the secondary structure of ITS1 rRNA was found in Artemia tibetiana. In the phylogenetic tree based on joined primary and secondary structure sequences, Artemia urmiana and pa...

Use of structural phylogenetic networks for classification of the ferritin-like superfamily.

In the postgenomic era, bioinformatic analysis of sequence similarity is an immensely powerful tool to gain insight into evolution and protein function. Over long evolutionary distances, however, sequence-based methods fail as the similarities become too low for phylogenetic analysis. Macromolecular structure generally appears better conserved than sequence, but clear models for how structure e...

Evolutionary Distances in the Twilight Zone—A Rational Kernel Approach

Phylogenetic tree reconstruction is traditionally based on multiple sequence alignments (MSAs) and heavily depends on the validity of this information bottleneck. With increasing sequence divergence, the quality of MSAs decays quickly. Alignment-free methods, on the other hand, are based on abstract string comparisons and avoid potential alignment problems. However, in general they are not biol...

Journal:
  • Proceedings of the National Academy of Sciences of the United States of America

Volume 105, Issue 36

Pages: -

Publication date: 2008